K-Graphs: Selecting Top-k Data Sources for XML Keyword Queries
Authors
Abstract
Existing approaches to XML keyword search mostly focus on querying a single data source. However, searching hundreds or even thousands of (distributed) data sources by querying each one sequentially is extremely costly and can therefore be impractical. In this paper, we propose an approach for selecting the top-k data sources for a given query, in order to avoid the high cost of searching numerous, potentially irrelevant data sources. The proposed approach can efficiently select the top-k most relevant data sources without querying the data sources themselves. We propose a ranking function for measuring the strength of correlation between keywords in a data source, and we summarize the data sources as keyword correlation graphs (K-Graphs). The top-k relevant data sources are selected by estimating the relevance of the corresponding K-Graphs to the query. Experimental results show that the approach achieves good performance under a variety of experimental parameters.
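The pipeline described in the abstract (summarize each source as a keyword graph, score each graph against the query, keep the k best) can be illustrated with a minimal Python sketch. The abstract does not give the actual ranking function, so everything below is an assumption made for illustration: the input format (one keyword set per XML element), the edge weight (fraction of elements in which two keywords co-occur), the relevance estimate (weakest pairwise correlation among the query keywords), and the names `build_kgraph`, `relevance`, and `select_top_k` are all hypothetical and are not the paper's method.

```python
from collections import defaultdict
from heapq import nlargest
from itertools import combinations


def build_kgraph(elements):
    """Summarize one data source as a keyword correlation graph.

    `elements` is a list of keyword sets, one per XML element (assumed format).
    Edge weight = fraction of elements in which both keywords co-occur.
    This weight is an assumed stand-in for the paper's ranking function.
    """
    counts = defaultdict(int)
    for keywords in elements:
        for a, b in combinations(sorted(set(keywords)), 2):
            counts[(a, b)] += 1
    total = len(elements) or 1
    return {pair: n / total for pair, n in counts.items()}


def relevance(kgraph, query):
    """Estimate relevance as the weakest pairwise correlation in the query.

    Returns 0.0 when the query has fewer than two distinct keywords,
    because this simplified graph stores only keyword pairs.
    """
    pairs = list(combinations(sorted(set(query)), 2))
    if not pairs:
        return 0.0
    return min(kgraph.get(pair, 0.0) for pair in pairs)


def select_top_k(kgraphs, query, k):
    """Return the names of the k best-scoring sources, using only the summaries."""
    scored = ((relevance(graph, query), name) for name, graph in kgraphs.items())
    return [name for score, name in nlargest(k, scored) if score > 0]


# Toy data: three sources, each a list of per-element keyword sets.
sources = {
    "dblp": [{"xml", "keyword", "search"}, {"xml", "search"}, {"graph"}],
    "movies": [{"actor", "film"}, {"xml", "film"}],
    "mixed": [{"xml", "search"}, {"film"}, {"graph"}, {"keyword"}],
}
kgraphs = {name: build_kgraph(elements) for name, elements in sources.items()}
top_sources = select_top_k(kgraphs, ["xml", "search"], k=2)
```

The point the sketch tries to capture is that selection consults only the precomputed summaries, so no source is queried at selection time. Sources whose estimated relevance is zero are dropped even when fewer than k candidates remain; that is a design choice of this sketch, not something stated in the abstract.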
Similar resources
Processing XML Keyword Search by Constructing Effective Structured Queries
Recently, keyword search has attracted a great deal of attention in XML databases. It is hard to directly improve the relevancy of XML keyword search because many keyword-matched nodes may not contribute to the results. To address this challenge, in this paper we design an adaptive XML keyword search approach, called XBridge, that can derive the semantics of a keyword query and generate a set...
Keyword Proximity Search on XML Graphs
XKeyword provides efficient keyword proximity queries on large XML graph databases. A query is simply a list of keywords and does not require any schema or query language knowledge for its formulation. XKeyword is built on a relational database and, hence, can accommodate very large graphs. Query evaluation is optimized by using the graph’s schema. In particular, XKeyword consists of two stages...
Efficient top-k algorithm for eXtensible Markup Language keyword search
The ability to compute top-k matches to eXtensible Markup Language (XML) queries is gaining importance owing to the increasing number of large XML repositories. Current work on top-k matching for XML queries mainly focuses on employing XPath, XQuery or NEXI as the query language, whereas little work has addressed top-k matching for XML keyword search. In this study, the authors propose a novel two-layer-ba...
Implementation of Efficient Keyword Routing in Linked Data
Keyword search is an intuitive paradigm for searching linked data sources on the web. We propose to route keywords only to relevant sources to reduce the high cost of processing keyword search queries over all sources. We propose a novel method for computing top-k routing plans based on their potentials to contain results for a given keyword query. We employ a keyword-element relationship summa...
SAIL: Structure-aware indexing for effective and progressive top-k keyword search over XML documents
Keyword search in XML documents has recently gained a lot of research attention. Given a keyword query, existing approaches first compute the lowest common ancestors (LCAs), or their variants, of the XML elements that contain the input keywords, and then identify the subtrees rooted at the LCAs as the answer. In this paper we study how to use the rich structural relationships embedded in XML docu...